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A needlet-based approach to the full-sky data analysis 
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In cosmic-ray physics, large field of view experiments are triggered by a number of signals laying 
on different angular scales: point-like and extended gamma-ray sources, diffuse emissions, as well 
as large and intermediate scale cosmic-ray anisotropics. The separation of all these contributions 
is crucial, mostly when they overlap with each other. Needlets are a form of spherical wavelets 
that have recently drawn a lot of attention in the cosmological literature, especially in connection 
with the analysis of CMB data. Needlets enjoy a number of important statistical and numerical 
properties which suggest that they can be very effective in handling cosmic-ray and gamma-ray data 
analysis. An application of needlets to astroparticle physics is shown here. In particular, light will 
be thrown on how useful they might be for estimating background and foreground contributions. 
Since such an estimation is expected to be optimal or nearly-optimal in a well-defined mathematical 
sense, needlets turn out to be a powerful method for unbiased point-source detections. In this paper 
needlets were applied to two distinct simulated datasets, for satellite and EAS array experiments, 
both large field of view telescopes. Results will be compared to those achievable with standard 
analysis tecniques in any of these cases. 
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Introduction 



Over the last decades very high energy cosmic-ray 
experiments made great headway in sensitivity and 
duty-cycle, providing huge amounts of very high qual- 
ity data. At the present time space telescopes are 
operated with angular resolution as good as few tens 
of arcminutes. Large field of view ground-based tele- 
scopes easily achieve 1 deg capability in reconstruct- 
ing the primary arrival direction. Every day exper- 
imenters cope with problems like resolving sources 
close to each other or separating them from diffuse 
or extended back/foreground or enhancing the signifi- 
cance of the detection, as well as defining the shape of 
extended sources or giving accurate estimation of di- 
rectional anisotropics. For instance, ARGO-YBJ ob- 
served very intense localized excesses of cosmic rays, 
10° — 30° wide The contribution of these regions 
must be carefully considered when looking at gamma- 
ray sources nearby and viceversa. The Cygnus region 
may present similar problems, because many known 
gamma-ray sources are there within few degrees. The 
case of crowded regions is quite common in satellite 
experiments as sensitive as Fermi-LAT, mostly when 
parts of the Galactic plane are considered (think of 
the Vela region or the Galctic center). 

In the framework of the standard analysis tech- 
niques, if the density in a certain direction on the 
sphere is considered as it is^ no information can be 
obtained about the angular scales which the signal 
comes from in that point. To get exactly such infor- 
mation, the harmonic expansion can be used, but in 
this case no localization in the real space can be ob- 
tained anymore. A very good tradeoff is represented 
by the wavelet expansion, where the exactness of the 
harmonic expansion is given up in favour of the pos- 
sibility of having good localization properties in real 



and harmonic domain at the same time. This is the 
reason why a growing interest has been devoted in 
the last five years to the application in a cosmologi- 
cal environment of a new form of spherical wavelets, 
called needlets. Needlets were introduced in the math- 
ematical literature by [2], see also [3| for extensions 
and generalizations; several applications to Cosmic 
Microwave Background data data have also been im- 
plemented: see for instance 0, [5[. More recently, a 
few papers have focussed on the use of needlets to de- 
velop estimators within the thresholding paradigm, in 
the framework of directional data, which provide the 
mathematical formalism for cosmic rays experiments 
[6]. 

In this paper, we shall first review briefly the main 
features of needlets and the explain how their proper- 
ties make them a very promising tool for cosmic rays 
data analysis. To point out how effective they may 
be either for satellite or ground experiments, needlets 
are applied here to two distinct simulated datasets, 
which reproduce the data acquisition of Fermi-LAT 
and ARGO-YBJ. The second section is then aimed 
at describing the simulation. Afterwards, we present 
applications of needlet procedures on the simulated 
dataset. A final section summarizes results and dis- 
cusses perspectives for further research. 



I. NEEDLETS CONSTRUCTION AND MAIN 
PROPERTIES 

Needlets enjoy quite a few important properties 
that make them very suitable for spherical data analy- 
sis. Indeed, needlets do not rely on any tangent plane 
approximation and they are perfectly adapted to stan- 
dard packages; their main property is the capability 
of unfolding directional data in the harmonic space, 
so that every order j of needlets contains power only 
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from within a certain range in the / space. In the 
real space for any fixed angular distance the tail of 
the needlets decay faster than any polynomial, i.e. 
quasi-exponent ially as the frequency increases. It is 
worth noticing that for needlets the exact reconstruc- 
tion property holds, i.e. back-transform exists giving 
the input signal within numerical accuracy. 

We do not give details about needlets construction 
here (on this purpose, see [2, 7]). Nonetheless, we re- 
call that the needlet idea has been extended by Geller 
and Mayeli [3[ with the construction of so called Mex- 
ican needlets, see also [s] for numerical analysis and 
implementation in a cosmological framework. Mexi- 
can needlets have localization properties in real space 
even better than standard needlets, that is why they 
were applied to get the results presented here. 

Numerical codes have been developed by the author 
to compute the needlet estimators /3jk for any sky map 
containing density information. Some other codes are 
publicly available [8] and results reported here have 
been suitably cross-checked. The index j refers to the 
needlet order, while k runs along the pixels into which 
the sphere has been discretized (the Healpix package 
is used here [9], with the "ring" numbering scheme). 

When the needlet transform is numerically com- 
puted, three parameters are important, i.e. jo (the 
first order computed) and rij (how many needlet or- 
ders are computed) and B > 1 which is a number used 
to pass from the j-th order to the {j + l)-th. A golden 
rule to be used is that the order j contains power 
mostly from multipoles in the range [B^~'^^ 5-^+^][l5|. 
If mexican needlets are used, one more parameter is 
important, p, which describes how fast needlets go to 
zero in the harmonic space. 

We shall now consider the analysis of data from cos- 
mic rays observatories, following classical approaches 
to wavelet-based density estimation. An idea that we 
shall pursue is to implement thresholding estimates, 
as discussed for instance by [6]. More precisely, we 
can consider the nonlinear estimate f*{x) obtained 
by suitably weighting the needlet coefficients in the 
back-transform. The weight function Wjk{Pjk) is some 
nonlinear function that "shrinks" beta towards zero. 
Such estimates can be shown to be robust and nearly 
optimal over a wide class of density functions and dif- 
ferent loss functions, i.e. figures of merit by which to 
measure when the estimate is "close" to the density 
to be estimated. 

Hereafter, we will name source-map the sky-map as 
it comes from the experiment; beta-maps those con- 
taining the coefficient estimators Pjk] reconstructed 
maps or hack-transformed maps^ those obtained after 
the application of the inverse needlet transform. 



II. NEEDLET ORDERS AND ANGULAR 
SCALES 

As for any mathematical transform, the effective- 
ness of needlets lies in the chances that they offer 
to the analyst when handling data in the transform 
space. In this sense, it is crucial to understand how 
certain given signals are represented in the transform 
space. Even more, in which part of the transform 
space the highest power of the signal is saved. On this 
purpose, the golden rule outlined in the previous sec- 
tion is an important basic tool. More in detail, many 
figures of merit can be defined to characterize the per- 
formance of the needlet transform, mostly looking at 
the reconstructed signal. Just to give an example, we 
introduce here two quantities which can be evaluated 
for simulated signal from point-like sources: 

• Pr = N^/N;^ ^ where r is a certain angular dis- 
tance from the source and R and S stand for "re- 
constructed" and "sampled" respectively, pr is 
the ratio of the number of events reconstructed 
within r from the source to the number of events 
sampled in the same region; 

• i^r) = ^/EZi [{K''-K')/K']' /nr, where 
the bin index i takes values [l,nr] correspond- 
ing to angular distances from the source (0,r). 
JVf^ and JVf indicate the number of events in 
the i^^ bin for the "reconstructed" and "sam- 
pled" map respectively. (Sr) is the square root 
of the of the reconstructed radial distribution 
to the sampled one, after normalizing it to the 
bin number. Smaller (^^), higher the confidence 
that not only the integral, but also the shape 
of the reconstructed signal reproduces well the 
sampled one. 

To properly characterize the method, we computed 
pr and (Sr) for many combinations of parameters 
(5,jo,nj), for both standard and Mexican needlet, 
as well as for many source extensions and intensities. 
Many detector point spread functions (PSFs) were ac- 
counted for, too. 

By way of example, figure [T] reports schematic views 
of Pr and (Sr) as functions of jo and nj. To realize it, 
we simulated a source emitting 10^ events, assuming 
a gaussian PSF with a = 1° and applied the Mexican 
needlet transform (5 = 1.6, p = 1, jo = 1, = 15). 
To oversimplify the example, no fore/background con- 
tributions are superimposed. The radius chosen to 
evaluate p and {5) was r = 3<j and no background con- 
tribution is taken in consideration. For both plots, the 
first order used to reconstruct the signal is represented 
on the horizontal axis, while the vertical axis number 
of orders used. The color scale represents the values 
of psa- and {63^)- For instance, bins (5,6) contain the 
values of psa- and {63^) obtained from the map recon- 
structed with needlet orders 5 ^ 10. What is easy 
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FIG. 1: Representation of the reconstruction properties of 
the Mexican needlet transform. Upper plot: psa- Lower 
plot: (Ssa). See text for definition and details. 

noticing is that the two plot are somehow conjugate. 
Where the ratio psa is close to 1, the error (S^a) is 
close to and viceversa. It happens in the top-left 
part of the schemes, representing cases where almost 
all the beta-maps computed in the transform are used 
in the reconstruction. Nonetheless a very good re- 
construction can be obtained also with less expensive 
choices: e.g. if orders 5 — 11 are used, more than 95% 
of events will be part of the reconstruction and the 
signal shape will be good too, because the error is less 
than 5%. 



III. /3-MAPS AND SIGNIFICANCE 
ESTIMATION 

To fully exploit the potential of needlet s, analy- 
sis can be carried out in the beta space, i.e. han- 
dling beta-maps. Selections and operations can be 
made there, after which data may be backtansformed 
if needed. As often in the real space the analysis is per- 
formed on the basis of what the experimenter expects 
in case the signal is not there (null detection hypothe- 
sis), all the same knowing how beta-maps would look 
like in absence of signal is essential. 

If there exists a reliable way of simulating how the 
sky would appear without the signal under study, an 
estimation of the average and the r.m.s. of the /3jk 
is easily achievable, what can be used as reference in 
managing the data in the beta space. This method 
gives an ad-hoc estimation for every pixel, although it 
inherits all the systematic uncertainties from the Mon- 



tecarlo simulation. Results reported in the section IVl 
are obtained in this way. 

Simulation is the main way, but not the only one. 
If what the experimenter considers noise [l6| is uni- 
formly distributed in the sky region under considera- 
tion, then the /3jk average and r.m.s. of may be es- 
timated in a small peripheral corner. This technique 
has the advantage to estimate the background directly 
from data, but values obtained are the same for all 
pixels considered. Results reported in the section |V] 
are obtained in this way. 

Once the variances <j^^ = {{Pjk — {Pjk)Y) are ob- 
tained, they can be used to evaluate the significance 
of the observed signal in the beta-space, i.e. to im- 
plement the detection procedure directly in the real 
space. 



IV. THRESHOLDING 

For the most part, analysis coincides with select- 
ing only those Pjk which are above a certain thresh- 
old {thresholding). Such threshold can be defined on 
the basis of local criteria (e.g. pixels are kept only 
that are within ip from the considered region, block 
thresholding)^ as well as statistical hypotheses. For in- 
stance, the variance-maps (t|^, however estimated, can 
be used as term of comparison, to get rid of the non- 
significant coefficients. We define here a soft thresh- 
olding technique whose results will be presented in the 
next sections. Let us introduce the weight functions 

where T is a certain threshold, passing which the 
weight function goes from to 1. The L parameter is 
related to the width of the transition region. 



V. ARGO-YBJ AND FERMI-LAT 
SIMULATION 

The first application of needlets to real high en- 
ergy astrophysics occurred for the ARGO-YBJ exper- 
iment [10]. ARGO-YBJ is an air shower array able 
to detect the cosmic radiation at an energy thresh- 
old of a few hundred GeV. Here we reproduce the 
analysis reported there, by simulating data from three 
years of data acquisition of the ARGO-YBJ detector. 
The trigger rate, the duty-cycle, the field of view and 
the angular resolution have been suitably reproduced. 
Charged cosmic rays have been simulated isotropi- 
cally, according to what reported in [Tl^ ; intermediate 
scale anisotropics have been added according to [1]. 
As far as TeV gamma-ray sources are concerned, the 
Crab nebula, Mrk421, Mrk501, MGRO1908 and the 
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Cygnus region were included in respect of the aver- 
age spectra reported in the hterature. No large scale 
anisotropics is there. 

To give an idea of what the method might obtain if 
applied to satellite-borne experiments, needlets have 
been applied also on a simulation of 1 year of data 
taking of Fermi-LAT. All 1451 sources present in the 
IFGL catalogue have been simulated. The galactic 
diffuse emission has been simulated using a Fermi 
collaboration template [17] and the extragalactic dif- 
fuse emission has been sampled with spectral index 
-2.4. Neither electrons or albedo photons have been 
accounted for. 

There is some oversimplification in neglecting the 
detector response function, as well as in considering 
the exposure to be uniform all over the sky. No deep 
study of denoising properties have been performed yet, 
then this work has to be considered as preliminary 
with respect to other similar wavelet applications to 
cosmic ray physics (see for instance [III). 



VI. RESULTS FOR THE ARGO-YBJ 
SIMULATION 

The figure [2] represents the beta-map obtained for 
the ARGO-YBJ simulated map, order 5. As we expect 
from the corresponding angular scale ((1.6)^ ~ 17°), 
intermediate scale anisotropics [1] are well visible. 



ARGO-YBJ sky-map 




FIG. 2: Beta-map from the simulated ARGO-YBJ sky. 
Mexican needlet transformation with 5 = 1.6, p = 1, 
J = 5. The color scale indicates the intensity of the beta 
coefficients. Mollweide projection, equatorial coordinates. 

To evaluate the significance of the signal with 
respect to the estimated background, we used an 
independent background estimation Bk as mean 
map and sampled 1000 random background copies 
S^, i = 1, . . . , 1000; every map has been needlet- 
transformed to compute the variance cr|^ These maps 
were directly used to calculate the significance of the 
content of the k^^ bin in the j^^ beta- map: Pjk/cFjk- 
For instance, the figure [3] shows the distribution of 
l39k/o'9k' The curve represents the gaussian fit to the 
/^9/c/c"9/e distribution, where P^k is the transform of 
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s.d. 

FIG. 3: Significance distribution of the pixels of the 9*^ 
order beta-map from the simulated ARGO-YBJ sky. 

a background random copy (background subtracted). 
The fit results are /i = -0.0005 ± 0.0011 and a = 
1.0014 ±0.0021, as it should be (xVd.o.f. = 38.9/37). 
The distribution of the signal map significantly devi- 
ates from the gaussian trend. The contribution of the 
sources is well visible from 3.5 up to 10 s.d. There is 
also a small excess below —3 s.d., whose origin is due 
to the shape of the needlet function. 

As anticipated in the previous sections, this method 
allows to estimate the significance of every excess di- 
rectly in the P space, with no need of signal-smoothing 
or averaging. Moreover, it must be recalled that a 
given signal may be there in two or more j orders 
and that the significance estimated here refers to the 
single- order. 

Finally it comes an example of application of the 
"soft thresholding" method. We set T = 3 and 
L = 0.2 for orders j = 5 - 11 (ill of the Mexican 
needlet transform {B = 1.6 and p = 1), then got 
the reconstructed map. These choices are optimized 
for point-like sources. A zoom of the result around 
the Crab nebula is given in figure SI The source is 
well visible and no fluctuations typical of images from 
EAS arrays are there. The number of events recon- 
structed within 5° from the nominal source position is 
(114 ±3)10^, to be compared with that obtained from 
standard analysis techniques (e.g. [14]) (116 ± 2)10^ 
and that simulated 115234. 



VII. RESULTS FOR THE FERMI-LAT 
SIMULATION 

The figure [5] reports two examples of needlet anal- 
ysis applied to the Fermi-LAT simulation aforemen- 
tioned. Mexican needlet transform has been applied, 
with B = 1.6, jo = and nj = 16; p = 1 as usual. 
If we group the beta-maps in low and high multipole 
cont ribut ions, j = 1 — 7 and j = 8 — 15 respectively, fig- 
ures [5(a)] and [5(b)] are obtained. The former sky map 
shows the diffuse emission along the galactic plane, as 
well as extended contributions from the most powerful 
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ARGO-YBJ: zoom on the Crab nebula (N>40) 

;edlet reconstruction: B=1.6, j = 6-ll, s.t. [3. ,0.2] 




i 



FIG. 4: Signal of the Crab region reconstructed with the 
Mexican needlet technique. Parameters of the transform: 
B = 1.6, p = 1, j = 5 - 11, soft thresholding [3.0, 0.2] 



point sources (Cygnus region, Vela, Crab). Therefore, 
needlets reveal themselves as a promising tool for in- 
vestigating diffuse emissions too, treating poin t-like 
sources as noise. On the other hand, in figure 5(b) 



these sources are well visible, together with some noise 
at high angular frequencies, to be got rid of via thresh- 
olding algorithms (see ahead). What is most notice- 
able is that no trace of diffuce emission around the 
galactic plane is left: aside the point sources within 
h = ±5° no events lay in the reconstructed map. 

To finish with, we report here one more application 
of the soft thresholding algorithm. It has been ap- 
plied to retrieve sources in the Cygnus region, which 
is known to have quite a complex morphology. The 
figure 6(a) reports ah events that have been simulated 
therein: diffuse contributions and point sources. The 
beta-maps have been filtered estimating the average 
and the r.m.s. in a corner of the region under consid- 
eration (see the caption for details). The figure [6(b)] 
reports the backtransform map realized with orders 
jf = 9 - 14 after threholding with T = 5 and L = 0.2. 
It can be appreciate how the sources appear to be well 
separated from each other and very few residual noise 
events remain. On the other hand, the figure |6(c) 



reports the sources as they were pre-set in the simula- 
tion (i.e. without the diffuse background). Although 
caution is opportune because of the strong oversimpli- 
fication of the analysis, the comparison of this figure 
to the previous one is a strong indication of how pow- 
erful the needlet approach might be in resolving point 
sources in crowded regions. 




(b) 



FIG. 5: Mexican needlet backtransform of the Fermi-LAT 
simulated dataset. Parameters of the transform: B — 1.6, 
used j 



P 



(a) 



(b) 



15. The 



1-7. 

color scales indicate the number of events reconstructed in 
each map (scale expanded in 
Galactic coordinates. 



used j 
events r 
Mollweide projections. 



VIII. CONCLUSIONS AND PERSPECTIVES 

Recently, needlets drew the attention of the scien- 
tific community for their important applications in 
data-analysis of cosmological data as a new form of 
spherical wavelets. We presented here some appli- 
cations of the needlet transform to high energy cos- 
mic ray physics, showing the very good properties 
of localization. The needlet transform is sensitive in 
the whole harmonic domain, provided that enough 
orders of needlets are computed. In particular, the 
application to a simulation of the ARGO-YBJ and 
the Fermi-LAT data-sets found again the well-known 
intermediate scale anisotropy at low orders and the 
point gamma-ray sources at higher orders, thus show- 
ing the possiblity of decting sources directly in the 
needlet space. The significance of the beta-maps is 
carried out quite easily if the background distribu- 
tion is known and the variance of the beta coefficients 
may be used to threshold the signal in the beta space, 
which turned out to be a very promising technique. 
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(76.9, 1.5) Galactic 



(a) 

Cygnus region: mexican needlet anti — transform 

(B=1.6, j=10-14, soft thresh. [5,0.2] r.m.s.) 
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Cygnus region 

simulation: iFGL 
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FIG. 6: Mexican needlet application to t he C ygnus re- 
gion in the simulated Fermi-LAT sky map. (a) all events 
simulated (point sources plus diffuse emission) . (b) back- 
transform map (used j = 9 — 14, block thresholding in 
(ra.,de.)= (299.5°, 39.0°), radius 2°). [(c)] pre-set sources. 
The color scales indicate the number of events. Mollweide 
projections. Galactic coordinates. 
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